Bayesian inference and experimental design for large generalised linear models
Author
Abstract
Decision making in light of uncertain and incomplete knowledge is one of the central themes in statistics and machine learning. Probabilistic Bayesian models provide a mathematically rigorous framework to formalise the data acquisition process while making explicit all relevant prior knowledge and assumptions. The resulting posterior distribution represents the state of knowledge of the model and serves as the basis for subsequent decisions. Despite its conceptual clarity, Bayesian inference computations take the form of analytically intractable high-dimensional integrals in practice, giving rise to a number of randomised and deterministic approximation techniques. This thesis derives, studies and applies deterministic approximate inference and experimental design algorithms with a focus on the class of generalised linear models (GLMs). Special emphasis is given to algorithmic properties such as convexity, numerical stability, and scalability to large numbers of interacting variables. After a review of the relevant background on GLMs, we introduce the most promising approaches to estimation, approximate inference and experimental design. We study one particular approach in depth and reveal its convexity properties, which naturally lead to a generic and scalable inference algorithm. Furthermore, we precisely characterise the relationship between Bayesian inference and penalised estimation: estimation is a special case of inference, and inference can be carried out by a sequence of smoothed estimation steps. We then compare a large body of inference algorithms on the task of probabilistic binary classification using a kernelised GLM: the Gaussian process model. Multiple empirical comparisons identify expectation propagation (EP) as the most accurate algorithm.
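The link between penalised estimation and inference mentioned above (estimation recovers the posterior mode) can be illustrated with a minimal sketch of MAP estimation for a Bernoulli-logit GLM with a Gaussian prior. The data, the prior precision `tau`, and all function names here are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_posterior(w, X, y, tau=1.0):
    """Negative log posterior of a Bernoulli-logit GLM with Gaussian
    prior N(0, I/tau): exactly penalised logistic regression."""
    z = y * (X @ w)                      # margins y_i * x_i^T w, y_i in {-1,+1}
    nll = np.sum(np.logaddexp(0.0, -z))  # -sum_i log sigma(z_i), numerically stable
    return nll + 0.5 * tau * np.dot(w, w)

# Synthetic data with a known weight vector
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
w_true = np.array([1.5, -2.0, 0.5])
y = np.where(X @ w_true + 0.1 * rng.standard_normal(100) > 0, 1.0, -1.0)

# The posterior mode (MAP estimate): a convex optimisation problem
w_map = minimize(neg_log_posterior, np.zeros(3), args=(X, y)).x
```

Because the logit likelihood is log-concave, the objective is convex and the mode is unique; a full Bayesian treatment would additionally quantify the uncertainty around `w_map`.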
As a next step, we apply EP to adaptively and sequentially design the measurement architecture for the acquisition of natural images in the context of compressive sensing (CS), where redundancy in signals is exploited to accelerate the measurement process. In comparative experiments, we observe differences between adaptive CS results in practice and the setting studied in theory. Combining the insights from adaptive CS with our convex variational inference algorithm, we are able, by sequentially optimising Bayesian design scores, to improve the measurement sequence in magnetic resonance imaging (MRI). In our MRI application on realistic image sizes, we achieve scan time reductions at constant image quality.
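Sequential optimisation of Bayesian design scores, as described above, can be sketched for the simplest tractable case: a Gaussian linear model where each step scores candidate measurement vectors by an information-gain criterion and then updates the posterior covariance. The score, the noise level, and the greedy loop are generic textbook choices assumed for illustration, not the thesis's exact design score:

```python
import numpy as np

def information_gain(Sigma, phi, noise_var=0.1):
    """Score of candidate measurement phi under posterior covariance Sigma:
    0.5 * log(1 + phi^T Sigma phi / noise_var), the mutual information
    gained by observing phi^T w + Gaussian noise."""
    return 0.5 * np.log1p(phi @ Sigma @ phi / noise_var)

def update_posterior(Sigma, phi, noise_var=0.1):
    """Rank-one covariance update after measuring along phi."""
    s = Sigma @ phi
    return Sigma - np.outer(s, s) / (noise_var + phi @ s)

rng = np.random.default_rng(1)
Sigma = np.eye(4)                                  # prior covariance
candidates = rng.standard_normal((20, 4))          # candidate measurement vectors
order = []
for _ in range(3):                                 # greedy sequential design
    scores = [information_gain(Sigma, c) for c in candidates]
    best = int(np.argmax(scores))
    order.append(best)
    Sigma = update_posterior(Sigma, candidates[best])
```

Each greedy step strictly reduces the posterior uncertainty; for non-Gaussian likelihoods (as in the GLMs studied here), the exact posterior covariance is replaced by the approximation produced by EP or variational inference.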
Similar references
Bayesian Inference for Spatial Beta Generalized Linear Mixed Models
In some applications, the response variable assumes values in the unit interval. The standard linear regression model is not appropriate for modelling this type of data because the normality assumption is not met. Alternatively, the beta regression model has been introduced to analyze such observations. A beta distribution represents a flexible density family on the (0, 1) interval that covers symm...
Full text
Gaussian Kullback-Leibler approximate inference
We investigate Gaussian Kullback-Leibler (G-KL) variational approximate inference techniques for Bayesian generalised linear models and various extensions. In particular we make the following novel contributions: sufficient conditions for which the G-KL objective is differentiable and convex are described; constrained parameterisations of Gaussian covariance that make G-KL methods fast and scal...
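The objective this abstract refers to can be sketched in generic notation (the symbols below are illustrative, not necessarily the paper's): for a GLM likelihood and a Gaussian prior, the G-KL approach fits a Gaussian q by maximising a lower bound on the log partition function,

```latex
\log Z \;=\; \log \int p(y \mid w)\, \mathcal{N}(w; 0, K)\, dw
\;\ge\; \mathbb{E}_{q}\!\left[ \log p(y \mid w) \right]
\;-\; \mathrm{KL}\!\left( \mathcal{N}(m, S) \,\middle\|\, \mathcal{N}(0, K) \right),
\qquad q = \mathcal{N}(w; m, S),
```

and the gap equals the KL divergence from q to the true posterior, so maximising the bound is equivalent to the KL-closest Gaussian fit.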
Full text
The Family of Scale-Mixture of Skew-Normal Distributions and Its Application in Bayesian Nonlinear Regression Models
In previous studies on fitting non-linear regression models with a symmetric structure, normality is usually assumed in the analysis of the data. This choice may be inappropriate when the distribution of the residual terms is asymmetric. Recently, the family of scale-mixture of skew-normal distributions has become the main concern of many researchers. This family includes several skewed and heavy-tailed d...
Full text
Sparse linear models: Variational approximate inference and Bayesian experimental design
A wide range of problems such as signal reconstruction, denoising, source separation, feature selection, and graphical model search are addressed today by posterior maximization for linear models with sparsity-favouring prior distributions. The Bayesian posterior contains useful information far beyond its mode, which can be used to drive methods for sampling optimization (active learning), feat...
Full text
APTS – Statistical Modelling – Preliminary Material
Linear and generalised linear models: A student who has covered Chapters 8 and 10.1–10.4 of Statistical Models by A. C. Davison (Cambridge University Press, 2003) will be more than adequately prepared for the APTS module. For students without access to this book, the main theory is repeated below. The inference methodology described is largely based on classical statistical theory. Although prio...
Full text